Regularized group regression methods for genomic prediction: Bridge, MCP, SCAD, group bridge, group lasso, sparse group lasso, group MCP and group SCAD
نویسندگان
چکیده
BACKGROUND Genomic prediction is now widely recognized as an efficient, cost-effective and theoretically well-founded method for estimating breeding values using molecular markers spread over the whole genome. The prediction problem entails estimating the effects of all genes or chromosomal segments simultaneously and aggregating them to yield the predicted total genomic breeding value. Many potential methods for genomic prediction exist but have widely different relative computational costs, complexity and ease of implementation, with significant repercussions for predictive accuracy. We empirically evaluate the predictive performance of several contending regularization methods, designed to accommodate grouping of markers, using three synthetic traits of known accuracy. METHODS Each of the competitor methods was used to estimate predictive accuracy for each of the three quantitative traits. The traits and an associated genome comprising five chromosomes with 10000 biallelic Single Nucleotide Polymorphic (SNP)-marker loci were simulated for the QTL-MAS 2012 workshop. The models were trained on 3000 phenotyped and genotyped individuals and used to predict genomic breeding values for 1020 unphenotyped individuals. Accuracy was expressed as the Pearson correlation between the simulated true and the estimated breeding values. RESULTS All the methods produced accurate estimates of genomic breeding values. Grouping of markers did not clearly improve accuracy contrary to expectation. Selecting the penalty parameter with replicated 10-fold cross validation often gave better accuracy than using information theoretic criteria. CONCLUSIONS All the regularization methods considered produced satisfactory predictive accuracies for most practical purposes and thus deserve serious consideration in genomic prediction research and practice. Grouping markers did not enhance predictive accuracy for the synthetic data set considered. But other more sophisticated grouping schemes could potentially enhance accuracy. Using cross validation to select the penalty parameters for the methods often yielded more accurate estimates of predictive accuracy than using information theoretic criteria.
منابع مشابه
Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors
Penalized regression is an attractive framework for variable selection problems. Often, variables possess a grouping structure, and the relevant selection problem is that of selecting groups, not individual variables. The group lasso has been proposed as a way of extending the ideas of the lasso to the problem of group selection. Nonconvex penalties such as SCAD and MCP have been proposed and s...
متن کاملRegularized methods for high-dimensional and bi-level variable selection
Many traditional approaches to statistical analysis cease to be useful when the number of variables is large in comparison with the sample size. Penalized regression methods have proved to be an attractive approach, both theoretically and empirically, for dealing with these problems. This thesis focuses on the development of penalized regression methods for high-dimensional variable selection. ...
متن کاملA group bridge approach for variable selection.
In multiple regression problems when covariates can be naturally grouped, it is important to carry out feature selection at the group and within-group individual variable levels simultaneously. The existing methods, including the lasso and group lasso, are designed for either variable selection or group selection, but not for both. We propose a group bridge approach that is capable of simultane...
متن کاملThe group exponential lasso for bi-level variable selection.
In many applications, covariates possess a grouping structure that can be incorporated into the analysis to select important groups as well as important members of those groups. One important example arises in genetic association studies, where genes may have several variants capable of contributing to disease. An ideal penalized regression approach would select variables by balancing both the ...
متن کاملPenalized methods for bi-level variable selection.
In many applications, covariates possess a grouping structure that can be incorporated into the analysis to select important groups as well as important members of those groups. This work focuses on the incorporation of grouping structure into penalized regression. We investigate the previously proposed group lasso and group bridge penalties as well as a novel method, group MCP, introducing a f...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 8 شماره
صفحات -
تاریخ انتشار 2014